The theory of sexual selection suggests several possible explanations for
the development of standards of physical attractiveness in humans. 
Asymmetry and departures from average proportions may be markers of 
the breakdown of developmental stability. Supernormal traits may pre- 
sent age- and sex-typical features in exaggerated form. Evidence from 
social psychology suggests that both average proportions and (in females)
“neotenous” facial traits are indeed more attractive. Using facial
photographs from three populations (United States, Brazil, Paraguayan 
Indians), rated by members of the same three populations, plus Russians 
and Venezuelan Indians, we show that age, average features, and (in 
females) feminine/neotenous features all play a role in facial attractive- 
ness. 

THEORY AND HYPOTHESES


Theory: Adaptive and Nonadaptive Mate 
Choice in Current Evolutionary Theory 


Evolutionary biologists have produced a considerable body of theory 
and evidence regarding mate choice for physical characteristics in non- 
human animals, while social psychologists have begun to test alterna- 
tive hypotheses of the criteria of facial attractiveness in different human 
populations. In this paper we discuss some current controversies about 
physical attractiveness in evolutionary theory and psychology, and we 
present some relevant results on standards of facial attractiveness from 
an ongoing study of the criteria and consequences of physical attractive- 
ness in five human populations. 

In 1871 Charles Darwin (1981) argued that the elaborate and brightly 
colored tail of the peacock, and similarly ornamental features in a host of 
other species, have evolved through sexual selection—that peacocks 
with elaborate tails have had more offspring, not because they enjoyed 
any advantage in the “struggle for existence,” but because they were
more often chosen as mates by the other sex. Most of Darwin’s scientific 
contemporaries rejected his theory of sexual selection by mate choice; 
they were commonly skeptical that “lower” animals could have anything
like an esthetic sense. The topic was also mostly neglected by the
founders of the Modern Synthesis in the 1930s and 1940s (Cronin 1991). 
But the past twenty years or so have seen a full vindication of Darwin, 
and it is now almost universally accepted that elaborate secondary 
sexual characters are commonly the result of sexual selection by mate 
choice. 

But Darwin did not offer any account of the origin or adaptive value of 
the preferences expressed in mate choice. Since his time, several processes
have been proposed that might account for such preferences, but
there is no consensus on which is likely to be most important. Facial 
attractiveness seems to present a special paradox for theories of sexual 
selection. Traits like body size and age may have direct consequences for 
survival and reproduction that make them relevant criteria for mate 
choice, but it is harder to imagine that subtle variations in facial mor- 
phology (at least among individuals of the same age) have strong direct 
fitness consequences. Below we will consider two possible theories that 
may circumvent this difficulty and may be particularly relevant to ex- 
plaining standards of human facial attractiveness. 

Canalization and Adaptive Mate Choice. Waddington (1957) argued that
the development of adaptations is typically canalized—guided by nega- 
tive feedback mechanisms adapted to keep development “on track” in 
spite of the possible perturbing influences of environmental insult and 
genetic load. But in the face of really powerful stresses, canalization is 
likely to break down, with several characteristic consequences, includ- 
ing departures from average proportions and fluctuating asymmetry. 


Departures from average proportions. Where size is concerned, the opti- 
mum may be different from the average, if greater than average size 
reflects a developmental history of good nutrition and low energy drain 
from injury and parasites. But where shape is concerned, departures 
from the average often reflect maldevelopment. Deutsch (1987) reviews 
evidence that many psychiatric syndromes are associated with facial 
dysmorphology (see also Garn et al. 1985). 


Fluctuating asymmetry. Developmental “noise” not only leads to dif- 
ferences between the features of affected individuals and those of aver- 
age individuals, but also (in bilaterally symmetrical species) to random 
differences between features on the right and left sides of affected 
individuals (or fluctuating asymmetry) above and beyond biologically 
normal (or directional) asymmetry. In nonhuman organisms, inbreed- 
ing, elevated homozygosity, parasite load, undernutrition, and expo- 
sure to pollution are all associated with increased fluctuating asymmetry 
(Parsons 1990). In humans, fluctuating asymmetry correlates with in- 
breeding, premature birth, psychosis, and mental retardation (Livshits 
and Kobylianski 1991). 

Both departures from average proportions and fluctuating asymmetry 
may reflect a history of environmental insult and genetic load, and 
forecast a future of reduced viability and fertility. Insofar as there are 
direct or indirect evolutionary advantages to choosing a healthy and 
fertile mate, selection will favor individuals who steer clear of mates that 
display departures from average proportions and fluctuating asymmetry.
Koeslag (1990) reviews evidence for “koinophilia” (preference for
average features), whereas Møller and Höglund (1991) and Thornhill
(1992) provide evidence that scorpionflies and swallows look for symmetry
in potential mates (see also Thornhill and Gangestad in this issue).


Sensory Bias and Nonadaptive Mate Choice. It seems likely that adapta- 
tions for complex perceptual discriminations will usually have nonadap- 
tive biases built into them, and there is a growing interest in the possi- 
bility that some mate preferences are adaptive by-products rather than 
adaptations in their own right (Enquist and Arak 1993; Ryan et al. 1990; 
Staddon 1975; ten Cate and Bateson 1989). Williams (1992) argues that
preferences for exaggerated stimuli may be a nonadaptive by-product of 
asymmetrical fitness functions. (These functions have nothing to do with 
fluctuating asymmetry!) For example, if reproductively immature males 
have shorter tails than mature males, then a female preference for males 
with longer than average tails may be adaptive if it leads females to 
avoid matings with juveniles—better to err on the long side than on the 
short. But as a by-product, females may show a nonadaptive preference 
for mature males with long tails over mature males with short or average 
tails. Given heritable variation in male tail length, the result over time 
will be the evolution of exaggerated male tail length through female 
choice. 

A recent simulation by Enquist and Arak (1993) models the evolution 
of nonadaptive preferences as by-products of adaptation. The authors 
present a “neural network” (a simple computer model of a retina and
nervous system) with a long-tailed shape representing a mate of the
right species, with a short-tailed shape representing a mate of the wrong 
species, and with random shapes. They make small random changes in 
the network and save those versions of the network that respond 
strongly to the right shapes and weakly to the wrong ones. By reiterat- 
ing this trial-and-error process in a simulation of natural selection they 
produce a network that distinguishes almost perfectly between mates of 
the right species (presented in a variety of orientations) and other 
stimuli. However, the network responds even more strongly to a few 
shapes—especially shapes that present the distinguishing features of 
the correct stimulus in an exaggerated form—than it does to the stimu- 
lus to which it was selected to respond! A further simulation shows that 
this nonadaptive sensory bias toward “supernormal stimuli” can persist
and result in the evolution of exaggerated traits, even when these traits 
carry a moderate fitness cost. 

Staddon (1975) notes that animals trained by reinforcement learning 
to show a response to a stimulus commonly show an even stronger 
response to a version of the stimulus that exaggerates its distinguishing 
features (a phenomenon called “peak shift”), and ten Cate and Bateson
(1989) note an analogous phenomenon in the case of imprinting. 


Hypotheses: The Young, the Average, 
and the Supernormal 


Social psychologists and cultural anthropologists have often argued 
that human standards of physical attractiveness are culturally deter- 
mined—that standards in different cultures have little or nothing in 
common, that one person’s standards are acquired by imitation or by
social reinforcement learning, and that standards can be understood 
only within the context of the whole system of meanings and values 
particular to a given culture (Hatfield 1986; Polhemus 1988). We do not 
argue that imitation, social learning, and symbolism are always unim- 
portant in setting standards of physical attractiveness; however, several 
lines of evidence suggest that other things are going on. 


Age. First, there seems to be little variation across populations in the 
relationship between age and attractiveness (Buss 1989; Symons 1979; 
Williams 1966). By way of comparison, consider how much variation 
there is in ideal fatness, with obesity favored in some societies and 
leanness in others (Brink 1989; Brown 1992). Yet in no society, to our 
knowledge, are physical markers of reproductive senility like dry, sag- 
ging, and wrinkled skin considered attractive. Probably signs of aging 
are universally considered sexually unattractive because throughout 
human evolutionary history individuals with an attraction to reproduc- 
tively senescent mates left few surviving offspring. 

Second, variation in perceived attractiveness within age classes may 
also depend in part on the operation of precultural psychological mecha- 
nisms. Discrimination between attractive and unattractive faces devel- 
ops at a very young age. Langlois and her co-workers (Langlois et al. 
1987) have shown that infants as young as two to three months of age, 
given a choice between looking at photographs of women’s faces rated 
attractive by adults and women’s faces rated unattractive, will spend 
more time looking at the attractive faces. Several mechanisms could 
account for this agreement in standards of facial attractiveness between 
adults and unenculturated infants. 


Average Features and Symmetry. A number of studies suggest that 
faces with proportions especially close to average proportions are more 
attractive than most faces. Langlois and Roggman (1990) rely on com- 
puter graphics to produce composite faces that blend the features of a 
number of faces; these composite faces are rated more attractive than 
most of the original faces used to produce the composite. Benson and 
Perret (1992), using a more complex graphics system, show that the 
averageness effect results partly because the composite faces have 
smoother complexions, and partly because they have proportions close 
to average proportions. Other studies relying on measurements of faces 
also find that attractive faces are more average (Farkas and Munro 1987; 
Strzałko and Kaszycka 1992). All these studies suggest that the human
mind may include a “face averaging device” which blends perceived
faces to arrive at an ideal prototypical face (Symons 1979). 

Some forthcoming work (see Thornhill and Gangestad in this issue)
also suggests that symmetry is an important component of facial attrac- 
tiveness. 


Supernormal Stimuli. There is probably more to facial attractiveness 
than averageness. The same study by Langlois and colleagues which 
found that composite faces are more attractive than most of the faces 
that go into making the composites also found that a few individual 
faces are consistently rated more attractive than any composite (Alley 
and Cunningham 1991). Cunningham (1986) shows that photographs of 
female faces rated attractive in the United States have unusually large 
eyes, high cheekbones, narrow cheeks, and small noses, chins, and 
jaws. McArthur and Berry (1983), working in the United States, and 
Riedl (1990), working in Austria, both using computer systems normally
used for police identification work, show that the ideal female face has
a more “neotenous” (juvenile) appearance, with larger eyes and more reduced
vertical dimensions, than the average female face, whereas the
ideal male face is closer to the average male face. 

The interpretation of these results is problematic. Cunningham argues 
that men are attracted to women with large eyes, small noses, and small 
chins and jaws because these are juvenile traits. On the other hand, men 
are attracted to women with high cheekbones and narrow cheeks be- 
cause these are mature traits. Cunningham gives no clear rationale why 
this particular combination of juvenile and mature traits is favored, 
rather than some other, or even the opposite, combination. It may be 
relevant that all the traits listed above are traits that distinguish female 
from male faces (Enlow 1990). Male faces undergo a more thorough 
remodeling during adolescence than female faces, with a great expan- 
sion of the nose, mid-face, brows, chin, and jaw, which reduces the 
apparent prominence of the eyes and cheekbones; many of the features 
that distinguish adult from juvenile faces also distinguish male from 
female faces. 

Below we consider how facial attractiveness is related to age, fluctuat- 
ing asymmetry, averageness, and exaggerated feminine/neotenous fea- 
tures in samples of five populations. 


MATERIALS AND METHODS 


Sampled Populations 


We collected photographs of individuals of both sexes, together with 
interviews and anthropometric measurements of photographic subjects 
from three populations: undergraduates at the University of Michigan in 
Ann Arbor; students at the Federal University of Bahia in Salvador,
Brazil; and natives of several villages of Ache (or Guayaki) Indians in 
eastern Paraguay. We collected ratings of each set of photographs and 
conducted interviews with raters from a new sample of University of 
Michigan undergraduates; from middle and lower class adults in Sal- 
vador, Brazil; from natives of a different Ache village; from students at 
the Moscow Institute of the Humanities in Russia; and from Hiwi (or 
Cuiva) Indians in southern Venezuela. Data were collected by Doug 
Jones in the United States, Brazil, and Russia; by Kim Hill in Venezuela; 
and by both researchers in Paraguay. Data collection extended over 
several fieldwork sessions from 1990 to 1992. 

In the United States, photographic subjects and raters were recruited 
in introductory anthropology and psychology courses, and by flyers 
posted around campus. In Brazil and Paraguay, photographic equip- 
ment was set up in public areas, and interested individuals were invited 
to participate. Raters in Brazil, Paraguay, Russia, and Venezuela were 
recruited in part by going from door to door, and in part by approaching 
potential raters in public places. 

Photographic subjects and raters in the United States were largely of 
European ancestry. Although we also collected attractiveness ratings of 
and from Asian-Americans and African-Americans, in this paper we 
present data for European-Americans only, since a restricted sample 
provides a better test of some of our hypotheses. Brazilian subjects and 
raters overwhelmingly described themselves as being of mixed ancestry, 
mostly European and African, with some American Indian. In this paper 
we will use our Brazilian sample to test hypotheses about age, symme- 
try, averageness, and supernormal features, but there are additional 
complexities in the Brazilian situation that we can only touch on here 
(see the conclusion to the section on “Average Proportions” under
Results). Russian raters were largely of Russian nationality, with some 
other nationalities of the former Soviet Union (Ukrainians, Jews, Ger- 
mans, Central Asians) represented as well. Culturally, the Ache and the 
Hiwi are the most divergent populations in our study. Both groups were 
foragers, with dogs being the only domesticated animals. Neither group 
had peaceable contact with the outside world until the 1960s, and both 
continue to have relatively little contact on a day-to-day basis with out- 
siders, and little exposure to Western media. 

Figures for age, sex, and number of subjects and raters are included in 
Table 1. 


Photographs, Interviews, and Ratings 


The senior author took photographs with a tripod-mounted Canon ES 
750 camera and Agfa color film of subjects seated against a white backdrop,
with their heads at a fixed distance and carefully positioned using
an angle-meter.

Table 1. Age and Facial Attractiveness

                            PHOTOGRAPHS OF FEMALES      PHOTOGRAPHS OF MALES
Raters (Male, Female)

BRAZILIANS
  n                         51                          23
  mean age                  23.0                        24.0
  age s.d.                  3.76                        3.34
  age range                 17-34                       19-32
                            Correlation  Regression     Correlation  Regression
                                         Slope                       Slope
  Brazilians (19,11)        -.38**       -.13**         -.40†        -.17†
  U.S. Americans (12,20)    -.07         -.03           -.36†        -.15†
  Russians (11,14)          -.22†        -.10†          -.10         -.04
  Ache Indians (11,13)      -.18         -.07           -.17         -.07
  Hiwi Indians (4,4)        -.10         -.05           -.54**       -.24**

U.S. AMERICANS
  n                         52                          31
  mean age                  20.1                        21.3
  age s.d.                  1.38                        2.84
  age range                 18-25                       18-30
                            Correlation  Regression     Correlation  Regression
                                         Slope                       Slope
  Brazilians (20,23)        -.15         -.12           .13          .07
  U.S. Americans (11,18)    -.12         -.13           .18          .12
  Russians (12,14)          -.16         -.17           .20          .10
  Ache Indians (20,21)      -.16         -.14           -.05         -.03
  Hiwi Indians (0,0)        n.a.         n.a.           n.a.         n.a.

ACHE INDIANS
  n                         41                          42
  mean age                  29.1                        31.7
  age s.d.                  10.54                       12.96
  age range                 14-51                       16-60
                            Regression Slope            Regression Slope
                            Segregated   Mixed          Segregated   Mixed
  Brazilians (17,16)        -.06         -.11†          -.07         -.09**
  U.S. Americans (12,15)    -.04         n.a.           -.07         n.a.
  Russians (12,12)          .04          -.05*          -.06         -.10†
  Ache Indians (15,15)      -.14         -.15**         -.04         -.05*
  Hiwi Indians (7,4)        -.13         n.a.           -.02         -.02

Note: Numbers in parentheses are number of raters that rated members of the
opposite sex in that category. Correlation is Pearson product-moment; slope
is slope of least-squares regression line. Negative correlations and
regression slopes mean that attractiveness declines with age. See text for
discussion of age-segregated and mixed Ache groups.
Significance (two-tailed): † p < 0.1; * p < 0.05; ** p < 0.01

We obtained ratings of photographs by laying out nine photos in three rows
in front of each rater and asking the rater, first, to sort each column by
attractiveness, putting the most attractive face (Portuguese: rosto mais 
bonito; English: most attractive face; Russian: samoe krasivoe litso; Ache: 
cha’a gatuvi = best face; Hiwi: wohune/pehenowa = pretty/handsome face) 
at the top and the least attractive at the bottom, and second, to sort rows 
left to right by attractiveness. The result was to rank the photographs 
roughly from 1 to 9. For each rater we laid out 3 by 3 grids, selecting
photos at random from a single population sample of the other sex, until 
we ran out. All but a few Ache and Hiwi, and all Brazilians, Americans, 
and Russians carried out the task without difficulties. 

A much greater range of ages was represented among the Ache, and 
our format for rating the photographs differed; in order to provide some 
control for age effects, we divided both our male and female Ache 
photographs into four subgroups, roughly segregated by age, and car- 
ried out rankings within subgroups. We also carried out a small number 
of rankings mixing together Ache from different subgroups in order to 
carry out a separate assessment of the effects of age on attractiveness. 

Although we did not explicitly ask raters to assess individuals in 
photographs as potential sexual or marital partners, many raters, espe- 
cially Ache and Brazilians of both sexes, made spontaneous comments 
along these lines. Ongoing research in Brazil by the senior author will 
assess to what extent perceived attractiveness has real-life social conse- 
quences. (And see Gangestad, in this issue, for material on conse- 
quences of physical attractiveness in the United States.) 


Measurement of Photographic 
Facial Landmarks 


We scanned each of our photographs with an Apple Scanner con- 
nected to a Macintosh II. We began by taking x,y coordinates of 60 
points on each face, but our tests of the face-averaging hypothesis are 
based on just 27 points—seven along the midline of the face and ten on 
either side—because remeasurement suggested that these 27 gave the 
most consistent results. The points we used are as follows: for the face, 
at the bottom of chin, junction of chin and lower lip, lower junctions of 
ear and cheek (right and left), lateralmost protrusion of cheekbone (both 
sides), and glabella; for the lips, bottom midpoint of lower vermilion, 
upper midpoint of lower vermilion (or lower midpoint of upper ver- 
milion), upper midpoint of upper vermilion, lateralmost extensions 
(both right and left); for the nose, bottom midpoint of nasal septum, 
lateralmost extensions of nasal alae (both sides); for the eyes and eye-
brows, upper and lower midlines of eyes, inside and outside corners of 
eyes, and innermost and outermost extensions of eyebrows (all three 
measured on both right and left sides). 

We calculated distances between each point and all other points along 
the midline and/or on the same side of the face. We eliminated some of 
the distances involving the eyebrows, since remeasurement suggested 
that our eyebrow landmarks measured vertical position better than 
horizontal. We were left with 247 distances between landmarks for 
each face. 
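
The pairing rule described above can be sketched in Python. This is only an illustration under stated assumptions: the landmark names, the side labels, and the coordinates are hypothetical, and the eyebrow exclusions are not modeled because the text does not list them. With 7 midline points and 10 points per side, the rule yields 251 pairs before any exclusions, which is consistent with the 247 distances reported after some eyebrow distances were dropped.

```python
import math
from itertools import combinations


def landmark_distances(landmarks):
    """Distances for pairs lying on the midline and/or the same side.

    `landmarks` maps a name to (side, x, y), with side in
    {"mid", "left", "right"}. Left-to-right pairs are skipped.
    """
    distances = {}
    for (name_a, (side_a, xa, ya)), (name_b, (side_b, xb, yb)) in combinations(
        sorted(landmarks.items()), 2
    ):
        if {side_a, side_b} == {"left", "right"}:
            continue  # cross-side pair: not used
        distances[(name_a, name_b)] = math.hypot(xa - xb, ya - yb)
    return distances


# Hypothetical layout: 7 midline points and 10 per side, made-up coordinates.
demo = {f"mid{i}": ("mid", 0.0, float(i)) for i in range(7)}
demo.update({f"left{i}": ("left", -1.0 - i, float(i)) for i in range(10)})
demo.update({f"right{i}": ("right", 1.0 + i, float(i)) for i in range(10)})

demo_distances = landmark_distances(demo)
n_pairs = len(demo_distances)
print(n_pairs)  # 251
```

The count follows from 21 midline-midline pairs, 140 midline-side pairs, and 45 same-side pairs on each side.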


Statistics 


We use two different indices to measure how much each face in our 
six population samples differs from the average face in that sample. 
Changing the size of a face without changing its shape will not change 
the value of either index. The first is a slightly modified version of the 
pattern variability index (PVI) of Garn, LaVelle, and Smith (Garn et 
al. 1985). For each of our six samples, and for each of our 247 log- 
transformed distances between facial landmarks, we calculate the medi- 
an. Then, for each log-distance for each face, we calculate the difference 
from the sample median. Finally, for each face we calculate the standard 
deviation of the 247 differences. The higher the PVI, the more the face 
differs from the proportions of the median face in the sample. Thus, if all 
the measurements of a face are 10% larger than the median measure- 
ments in the population (1.1, 1.1, 1.1, . . .), the standard deviation of the 
logarithms of these measurements will be zero; the face is larger than 
average, but its proportions are those of the average face. But if one of the 
measurements is 10% larger than the median, one is 20% larger, one is 
10% smaller, and so on (1.1, 1.2, 0.9, . . .), then the standard deviation 
of the logarithms of those measurements will be greater than zero, 
reflecting the difference in proportions between the face being measured 
and the average face. 
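
A minimal Python sketch of the PVI calculation as described above may make the scale-invariance concrete. It is an illustration, not the authors' code: the function names and the toy sample are hypothetical, only four distances are used instead of 247, and the choice of the population standard deviation (rather than the sample standard deviation) is an assumption the text does not settle.

```python
import math
import statistics


def sample_medians(sample):
    """Median log-distance for each landmark pair across a sample of faces."""
    n_distances = len(sample[0])
    return [
        statistics.median(math.log(face[k]) for face in sample)
        for k in range(n_distances)
    ]


def pvi(face, medians):
    """Standard deviation of a face's log-distance deviations from the medians."""
    deviations = [math.log(d) - m for d, m in zip(face, medians)]
    return statistics.pstdev(deviations)  # population s.d. is an assumption


# Toy sample: three faces, four distances each (made-up values).
sample = [
    [9.0, 18.0, 27.0, 36.0],
    [10.0, 20.0, 30.0, 40.0],
    [11.0, 22.0, 33.0, 44.0],
]
medians = sample_medians(sample)

scaled_face = [11.0, 22.0, 33.0, 44.0]     # every distance 10% above the median
distorted_face = [11.0, 24.0, 27.0, 40.0]  # 10% up, 20% up, 10% down, unchanged

pvi_scaled = pvi(scaled_face, medians)
pvi_distorted = pvi(distorted_face, medians)
```

Here `pvi_scaled` is zero up to floating-point error, while `pvi_distorted` is positive, matching the two cases in the text.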

The Euclidean distance matrix index (EDMI) for each face is the
logarithm of the ratio of the largest to the smallest differences from the 
sample median (Lele and Richtsmeier 1991). Again, if all the measure- 
ments of a face are 10% larger than average, then the logarithm of 
the ratio of the largest to the smallest of these differences will be 
log(1.1/1.1)=log(1)=0, whereas if any two landmark distances differ 
from their medians by different percentages, the resulting EDMI will be 
greater than zero. 
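
The EDMI can be sketched in the same way. This reading treats each "difference from the sample median" as the ratio of a face's distance to the median distance, which is what the worked example log(1.1/1.1) implies; the function name and the toy values are hypothetical.

```python
import math


def edmi(face, median_face):
    """Log of the largest over the smallest ratio of distance to median."""
    ratios = [d / m for d, m in zip(face, median_face)]
    return math.log(max(ratios) / min(ratios))


median_face = [10.0, 20.0, 30.0]  # made-up sample medians

edmi_scaled = edmi([11.0, 22.0, 33.0], median_face)     # uniformly 10% larger
edmi_distorted = edmi([11.0, 24.0, 27.0], median_face)  # ratios 1.1, 1.2, 0.9
```

In this toy case `edmi_scaled` is zero and `edmi_distorted` equals log(1.2/0.9). Because the index uses only the two most extreme ratios, it is driven by the most deviant landmark distances, whereas the PVI reflects the spread of all of them.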

More measurement error is involved in the measurement of fluctuat- 
ing asymmetry than in the measurement of averageness, since differ- 
ences between the right and left side of one person’s face are typically
smaller than differences between two different faces. When we re-
measured faces, measurement error was less than 20% for all of our 247 
distance measurements, but more than 50% for most measurements of 
right/left distance differences. We selected just six measurements (from 
inside and outside corners of the eye to outside corner of lip, bottom 
midpoint of nasal septum, and chin) that seemed particularly reliable 
(measurement error less than 30%). For each measurement we sub- 
tracted left from right distances, divided by the mean of left and right, 
and subtracted the median left-right difference for that measurement 
and that sample (to correct for directional—i.e., nonfluctuating— 
asymmetry). Our index of fluctuating asymmetry (FA) is the average of 
these six numbers for each photograph. 
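
The steps of the FA index can be sketched as follows. Two points are assumptions rather than statements from the text: the function names and toy data are hypothetical, and the sketch averages absolute values of the corrected differences, since signed differences would tend to cancel. The text itself says only that the index is "the average of these six numbers."

```python
import statistics


def relative_asymmetry(right, left):
    """(right - left) divided by the mean of right and left."""
    return (right - left) / ((right + left) / 2.0)


def fa_indices(sample):
    """FA index per face; each face is a list of (right, left) distances."""
    n_measures = len(sample[0])
    raw = [[relative_asymmetry(r, l) for r, l in face] for face in sample]
    # Median signed difference per measurement estimates directional asymmetry.
    directional = [
        statistics.median(face[k] for face in raw) for k in range(n_measures)
    ]
    return [
        # Absolute value is an assumption of this sketch (see lead-in).
        statistics.fmean(abs(face[k] - directional[k]) for k in range(n_measures))
        for face in raw
    ]


# Toy sample: three faces, two measurements each (the text uses six).
toy_sample = [
    [(10.0, 10.0), (20.0, 20.0)],
    [(10.5, 9.5), (20.0, 20.0)],
    [(9.5, 10.5), (20.0, 20.0)],
]
fa = fa_indices(toy_sample)  # approximately [0.0, 0.05, 0.05]
```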

To test the hypothesis that femininity/neoteny is a component of 
female facial attractiveness, we began by comparing median distances in 
male and female photographic samples. When we looked at the ratio of 
median female measurements to median male measurements in each of 
our three populations of photographic subjects, the distances between 
landmarks around the eyes and eyebrows consistently produced some 
of the highest ratios (typically > 1), while vertical distances along the 
midline of the face consistently produced some of the lowest (typically < 
0.9). In other words, the size of the eyes in relation to the height of the 
face strongly distinguishes between females and males. The same ratio 
also distinguishes juvenile from adult faces (Enlow 1990; Farkas and 
Munro 1987), although we can’t show this with our data set. In this 
paper, we report the relation to attractiveness of the eye width to face 
height ratio (EW/FH)—the mean of right and left eye widths divided 
by the distance between the bottom of the chin and the glabella (the 
lower border of the forehead along the midline of the face, between 
the eyebrows). 
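
As a small illustration, the EW/FH ratio might be computed from landmark coordinates like this. The landmark names and coordinates are hypothetical, and taking eye width as the distance between the inside and outside corners of the eye is an assumption based on the landmark list given earlier.

```python
import math


def ew_fh(points):
    """Mean eye width divided by chin-to-glabella face height."""

    def dist(a, b):
        (xa, ya), (xb, yb) = points[a], points[b]
        return math.hypot(xa - xb, ya - yb)

    right_width = dist("right_eye_inner", "right_eye_outer")
    left_width = dist("left_eye_inner", "left_eye_outer")
    face_height = dist("chin_bottom", "glabella")
    return ((right_width + left_width) / 2.0) / face_height


# Made-up coordinates: both eyes 3 units wide, face 12 units high.
toy_points = {
    "chin_bottom": (0.0, 0.0),
    "glabella": (0.0, 12.0),
    "right_eye_inner": (-1.5, 10.0),
    "right_eye_outer": (-4.5, 10.0),
    "left_eye_inner": (1.5, 10.0),
    "left_eye_outer": (4.5, 10.0),
}
toy_ratio = ew_fh(toy_points)
print(toy_ratio)  # 0.25
```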

In our analyses, we sometimes present pooled data. Pooling of sam- 
ples was carried out by calculating z-scores within each sample, to allow 
for differences in sample means and standard deviations, and then 
pooling these scores to calculate correlations and significance levels. In a
few cases, we summarize our results by presenting average correlations, 
but in these cases we do not attempt significance tests. 
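
The pooling procedure can be illustrated with a short Python sketch. The function names and the toy data are hypothetical, and the use of the sample standard deviation for the z-scores is an assumption. The example shows why standardizing within samples matters: two samples that each show a perfect positive relationship can yield a negative correlation if their raw values are simply stacked.

```python
import math
import statistics


def z_scores(values):
    """Standardize values within one sample (sample s.d. is an assumption)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def pearson(xs, ys):
    """Pearson product-moment correlation."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def pooled_correlation(samples):
    """samples: list of (predictor_values, attractiveness_values) pairs."""
    pooled_x, pooled_y = [], []
    for xs, ys in samples:
        pooled_x.extend(z_scores(xs))
        pooled_y.extend(z_scores(ys))
    return pearson(pooled_x, pooled_y)


# Two made-up samples with very different means and scales.
sample_a = ([1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
sample_b = ([100.0, 200.0, 300.0], [1.0, 2.0, 3.0])

pooled_r = pooled_correlation([sample_a, sample_b])
raw_r = pearson(sample_a[0] + sample_b[0], sample_a[1] + sample_b[1])
```

In this toy case `pooled_r` is 1 (up to rounding) while `raw_r` is negative.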


RESULTS 


Effects of Age 


Table 1 presents correlations between ages of photographic subjects 
and mean ratings of attractiveness for different combinations of photo- 
graphic subjects and raters. Since the range of ages represented in our 
photographic samples differs, we also present slopes of least-squares
regression lines to make it easier to compare different samples. 

The photographic samples from the United States have the most 
limited age ranges (18 to 25 for females, 18 to 30 for males). Not sur- 
prisingly, the correlations between age and attractiveness in these sam- 
ples are low. There is a greater range of ages in the Brazilian photo- 
graphic sample (17-34, 19-32), and correlations with attractiveness are 
consistently (and in some cases significantly) negative for both sexes. 

The format for the Ache photographic subjects (presented at the 
bottom of Table 1) differs somewhat from that for the other two
groups, since we have two sets of attractiveness ratings for each subject. 
The Ache photographs were divided into four subgroups, roughly seg- 
regated by age, and rankings reflect relative standing within a subgroup. 
The mixed rankings derive from all four groups combined and reflect 
standing relative to all other same-sex Ache in the sample. (See Mate- 
rials and Methods for further discussion.) For the Ache we present two 
columns of regression slope coefficients for each sex. The first is based 
on calculating the regression of attractiveness within subgroups on age 
separately for each of four subgroups. Table 1 gives the average of the 
resulting slopes. The second column is based on calculating the regres- 
sion of attractiveness on age for the mixed sample. The two different 
methods of estimating the effects of age on attractiveness among the 
Ache give generally similar results, although results from the mixed 
sample are consistently a little higher. 

For populations in which potential mates span a range of ages from 
adolescence to reproductive senility, physical differences associated 
with age are probably the most important determinants of physical 
attractiveness for both sexes. Our Ache sample spans this range, and for 
this sample age is a much stronger predictor of attractiveness than any 
of the other variables we will consider. This topic deserves a much more 
extended treatment than we can give in this paper. For present pur- 
poses, age-related differences must be controlled before we can look at 
possible non-age-related criteria of physical attractiveness. This presents 
problems because our estimates of the slopes of regression lines have 
considerable margin for error and show a great deal of variability be- 
tween photographic samples and raters. We approached this problem 
by calculating correlations in three ways: without controlling for age; 
controlling for age using the regression slopes presented in Table 1 for 
each combination of photographic subjects and raters (a Lilliefors test 
showed that none of the age and attractiveness distributions used in this 
regression were significantly different from a normal distribution); and 
controlling for age assuming first low (.05), then medium (.1), then high 
(.15) values for the regression slope across all combinations of photo- 
graphic subjects and raters. With few exceptions (discussed below), our
conclusions are quite robust in the face of different assumptions about 
the strength of the relationship between age and attractiveness. In all 
the tables that follow, we show correlations controlled for age using the 
regression slopes from the second assumption listed above. 
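
One way to picture the age correction is as a residual: subtract from each rating the part predicted by age under a given slope. The sketch below is an assumption about how such a correction could be implemented, not a record of the procedure actually used; the function names and data are hypothetical, and the fixed slopes are entered as negative values on the assumption that attractiveness declines with age.

```python
import statistics


def least_squares_slope(ages, ratings):
    """Slope of the least-squares regression of rating on age."""
    mean_age = statistics.fmean(ages)
    mean_rating = statistics.fmean(ratings)
    sxy = sum((a - mean_age) * (r - mean_rating) for a, r in zip(ages, ratings))
    sxx = sum((a - mean_age) ** 2 for a in ages)
    return sxy / sxx


def age_adjusted(ratings, ages, slope):
    """Remove the age effect implied by `slope`, centered on the mean age."""
    mean_age = statistics.fmean(ages)
    return [r - slope * (a - mean_age) for r, a in zip(ratings, ages)]


# Made-up data in which rating falls by exactly 0.1 per year of age.
toy_ages = [20.0, 25.0, 30.0, 35.0]
toy_ratings = [5.0, 4.5, 4.0, 3.5]

estimated_slope = least_squares_slope(toy_ages, toy_ratings)
adjusted = age_adjusted(toy_ratings, toy_ages, estimated_slope)

# Sensitivity check with fixed low, medium, and high slopes.
by_fixed_slope = {
    s: age_adjusted(toy_ratings, toy_ages, s) for s in (-0.05, -0.10, -0.15)
}
```

With the estimated slope, every adjusted rating in this toy example equals the mean rating of 4.25, so no age signal remains. Repeating the downstream correlations for each entry of `by_fixed_slope` mirrors the robustness check described in the text.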


Intergroup Comparisons 


With age partialled out, how much do samples of different popula- 
tions of raters agree in their judgments of physical attractiveness? Table 
2 shows correlations in age-corrected ratings of physical attractiveness 
between samples of raters. (We use Pearson product-moment correla- 
tions because a Lilliefors test shows that the distributions of attractive- 
ness ratings do not depart significantly from normality.) In the Western 
cluster, which consists of raters from Brazil, the United States, and 
Russia, there is very strong and significant agreement on standards of 
physical attractiveness for all population samples rated. The average 
correlation between members of the Western cluster is .64. The average 
correlation for the Indian cluster, consisting of Ache and Hiwi raters, is 
.42. Finally, even in the cross-cluster comparisons there seems to be 
some agreement, since a number of correlations are significantly posi- 
tive and none are significantly negative. The average correlation across 
clusters is .13. 

These results have some implications for hypotheses about criteria of 
facial attractiveness. First, standards of attractiveness vary across popu- 
lations. This is certainly not news. Darwin (1981), Westermarck (1891), 
and Ellis (1942), relying on missionaries’ and travelers’ accounts, all 
reported such variation across populations, and modern ethnographers 
commonly second these reports (cf. Ford and Beach 1951). These results 
do not disprove the existence of specialized biological adaptations for 
assessing attractiveness, any more than linguistic variation disproves 
the existence of specialized biological adaptations for processing phonol- 
ogy and syntax. However, they do argue that theories of the psychology 
of attractiveness need to be tested across a range of cultures, if they are 
to have strong claims to be theories of human, and not merely industrial 
Western, psychology. 

Second, shared culture probably cannot completely account for sim- 
ilarities in standards of physical attractiveness across populations. 
Shared culture might be responsible in part for similarities in standards 
of attractiveness within the Western cluster on Table 2. But it is hard to 
see how similarities between Ache and Hiwi Indian standards could 
have anything to do with shared culture; Ache and Hiwi cultures have 
been developing independently for many thousands of years. On the 
other hand, something like a Face Averaging Device (see Background
and Theory) would make sense in this case; two physically similar 
groups like the Ache and Hiwi could have similar ideal composites even 
without culture contact. More complicated mechanisms—e.g., setting 
the ideal face equal to the average face subject to some transformation— 
would also work. 

Table 2 also presents Cronbach’s alphas for each combination of rater 
and photographic sample. (See diagonal matrix elements in italic boldface.)
These values set a lower bound to the reliability of our attractiveness
ratings. Thus the error in rating j is less than or equal to 1 - α_j.
Since α > 0 for all attractiveness ratings, correlations of attractiveness
with other variables in this and other tables underestimate the true 
correlations. 
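
For readers who want the reliability argument in concrete form, the following is a minimal sketch of Cronbach's alpha for a panel of raters, together with the classical (Spearman) correction for attenuation. It treats each rater as an "item" and each photograph as a case; the function names and the toy ratings are hypothetical, and the correction shown is the textbook formula, not a step carried out in our analysis.

```python
from statistics import variance


def cronbach_alpha(ratings_by_rater):
    """Cronbach's alpha for k raters who each rated the same photographs.

    ratings_by_rater: list of k lists, one rating per photograph each.
    """
    k = len(ratings_by_rater)
    # Total score per photograph, summed over raters.
    totals = [sum(photo) for photo in zip(*ratings_by_rater)]
    rater_variance = sum(variance(r) for r in ratings_by_rater)
    return k / (k - 1) * (1 - rater_variance / variance(totals))


def disattenuated(r_observed, rel_x, rel_y=1.0):
    """Classical correction: estimated true r given the two reliabilities."""
    return r_observed / (rel_x * rel_y) ** 0.5


# Hypothetical ratings of four photographs by three raters.
panel = [
    [5, 3, 4, 2],
    [6, 3, 5, 2],
    [5, 4, 4, 1],
]
alpha = cronbach_alpha(panel)
print(round(alpha, 2))
print(round(disattenuated(0.30, alpha), 2))
```

Because alpha is a lower bound on reliability, a correlation corrected with alpha is, if anything, an overestimate of the true value; the observed and corrected figures bracket it.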


Average Proportions 


Do raters perceive faces as especially attractive when these faces have 
proportions especially close to average proportions (with “average” 
meaning average for the population from which raters are drawn)? We 
have used a version of Garn, LaVelle, and Smith’s pattern variability 
index (PVI) and a Euclidean Distance matrix index (EDMI), derived from 
Lele and Richtsmeier (see above), to determine how closely the propor- 
tions of each face in our sample correspond to average proportions. 
Table 3 shows the correlations of log PVI and log EDMI with age- 
corrected attractiveness. (We take the logarithms of our indices to cor- 
rect for significant right skewness, as measured by a Lilliefors test for 
normality. We get virtually identical results using the original values of 
the indices. Log PVI and log EDMI are strongly and significantly corre- 
lated with each other for all photographic samples: average r = .8, range 
= .66-.87.) Negative correlations in Table 3 mean that deviant faces are 
less attractive. 

According to the face averaging hypothesis, we expect raters to be 
particularly attracted to faces especially close to the average in their own 
populations. The shaded areas indicate cases in which raters are rating 
members of their own population. For the pooled sample we pooled 
data within these boxes only. 

Table 3 provides strong support for the face averaging hypothesis 
only for ratings of Ache photographs, although trends for other photo- 
graphic samples are generally in the right direction. (Positive correla- 
tions for some of the ratings carried out by Hiwi women probably reflect 
the very small number of raters.) Neither of our indices of difference in 
facial proportions is consistently a better predictor than the other. 

Why should the averageness effect be so much stronger for the Ache
than for the other groups in our sample? And why are even non-Indian 
raters sometimes attracted to Ache faces close to average Ache propor- 
tions? Part of the answer may be that departures from average propor- 
tions are correlated with age for the Ache (r = .27, .27, .37, .14 for PVI 
and EDMI for males and females, respectively), so our estimate of the 
effect of nonaverage features on physical attractiveness is sensitive to 
our assumptions about the effects of age on attractiveness. (See “Effects 
of Age” above; this is the only finding in this paper that is greatly 
affected by our assumptions about the attractiveness-age regression.) 
Also, although the variability of Ache faces is about the same as that for 
the other groups (judging by averages and standard deviations of PVI 
and EDMI), the causes of variability are probably different. The range of 
ages is much greater in the Ache sample than in the others; conditions of 
life have been much harder for the Ache; and the Ache are probably 
more genetically homogeneous than our other samples. It is possible 
that departures from average features resulting from aging and a hard 
life detract more from attractiveness than departures resulting from 
genetic heterogeneity in modern multi-ethnic societies. Correcting for 
age can remove some of these effects, but (if people age at different 
rates) not all of them. 

Brazil may be something of a special case. In Table 3, where Brazilians 
are rating Brazilians, the correlations of attractiveness with PVI and 
EDMI for females and males are -.16, -.18, -.23, and -.26, respectively.
Suppose we recalculate our indices using measurements of U.S.
American rather than Brazilian faces as our standard. In other words, 
suppose we measure how much each Brazilian face differs from the 
average American face, rather than from the average Brazilian face. The 
corresponding correlations with attractiveness in this case are consistently
stronger (-.21, -.26, -.26, and -.33). Brazilians seem to be
evaluating Brazilian faces more by how closely they match U.S. propor- 
tions than by how closely they match Brazilian proportions! This proba- 
bly reflects both the influence of North American and European media, 
and a local class structure in which the rich are disproportionately 
European in ancestry and appearance and the poor are disproportionately
African. Hoetink (1967) suggests that in racially stratified societies
the standard of physical attractiveness (the “somatic norm image”)
will be weighted toward the physical appearance of the dominant group 
(cf. Franklin 1968; Russell et al. 1992); we hope to give this topic a more 
extended treatment in future publications. 


Fluctuating Asymmetry


Table 4 presents data on the relationship between facial fluctuating 
asymmetry and facial attractiveness. The results are unimpressive. 
Among all our samples of raters, only Russian women show a really 
significant attraction to faces with low FA. These results differ from 


Table 4. Fluctuating Asymmetry and Facial Attractiveness


PHOTOGRAPHS                     PHOTOGRAPHS    PHOTOGRAPHS
  Raters (Male, Female)         OF FEMALES     OF MALES

BRAZILIANS
  n                               51             23
  average FA                     .0053          .0052
  s.d. FA                        .0031          .0036
  Brazilians (19,11)             -.15           -.22
  U.S. Americans (12,20)         -.08           -.27
  Russians (11,14)               -.25*           .07
  Ache Indians (11,13)            .14            .08
  Hiwi Indians (4,4)             -.03            .26

U.S. AMERICANS
  n                               52             31
  average FA                     .0059          .0052
  s.d. FA                        .0036          .0021
  Brazilians (20,23)              .07           -.12
  U.S. Americans (11,18)         -.01            .07
  Russians (12,14)               -.10           -.23
  Ache Indians (20,21)           -.07           -.03
  Hiwi Indians (0,0)             n.a.           n.a.

ACHE INDIANS
  n                               41             42
  average FA                     .0052          .0071
  s.d. FA                        .0020          .0038
  Brazilians (17,16)             -.07           -.04
  U.S. Americans (12,15)         -.09           -.10
  Russians (12,12)               -.23†           .03
  Ache Indians (15,15)            .03           -.12
  Hiwi Indians (7,4)             -.03            .05

POOLED SAMPLES
  n                              145             96
  Brazilians (56,50)             -.05           -.11
  U.S. Americans (35,53)         -.06           -.09
  Russians (35,40)               -.19*          -.04
  Ache Indians (46,49)            .03           -.04
  Hiwi Indians (7,5)             -.02            .12


Note: Numbers are Pearson product-moment correlations between average ratings of 
attractiveness and logarithms of a measure of fluctuating asymmetry (FA), with age 
partialled out. Negative correlations mean that attractiveness declines with increasing 
asymmetry. 

Significance (one-tailed): † p < .10; * p < .05
forthcoming results discussed by Thornhill and Gangestad in this issue,
although measurement error in FA (discussed above) may weaken our 
test. 

FA is weakly correlated with PVI and EDMI (pooled r = .11, .22 for 
females; .05, .09 for males). Controlling for PVI and EDMI slightly 
lowers the correlations of FA with attractiveness, but controlling for FA 
has virtually no effect on the correlation of PVI and EDMI with attrac- 
tiveness. FA is not significantly correlated with age among the Ache. 
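
Why controlling for a weakly related variable changes so little can be seen from the standard first-order partial correlation formula. The sketch below is illustrative only: the function name is ours, and the two correlations with attractiveness are hypothetical placeholders; only the FA-PVI value of .11 (pooled females) comes from the text above.

```python
def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / ((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5


r_fa_pvi = 0.11     # pooled FA-PVI correlation for females, from the text
r_pvi_attr = -0.20  # hypothetical PVI-attractiveness correlation
r_fa_attr = -0.05   # hypothetical FA-attractiveness correlation

# PVI with attractiveness, holding FA constant.
print(round(partial_r(r_pvi_attr, r_fa_pvi, r_fa_attr), 3))
# FA with attractiveness, holding PVI constant.
print(round(partial_r(r_fa_attr, r_fa_pvi, r_pvi_attr), 3))
```

When the control variable is only weakly correlated with both of the others, the numerator and denominator adjustments are both small, so the partial correlation stays close to the zero-order value.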


Supernormal Stimuli 


As discussed above, recent research and theory suggest that prefer- 
ences for exaggerated, or “supernormal,” stimuli may sometimes be 
adaptive by-products rather than adaptations in their own right. Selec- 
tion for choosing a mate of the right species, sex, and age may lead 
incidentally to nonadaptive biases in mate choice among those individu- 
als who fall within the ‘right’ species, sex, and age class. 

Table 5 presents correlations between a measure of femininity/ 
neoteny of facial features—the eye width to face height ratio—and 
ratings of attractiveness. In ratings of both Brazilian and U.S. females a 
high EW/FH ratio (an exaggerated feminine/neotenous feature) is consis- 
tently and often strongly and significantly correlated with attractive- 
ness. No such correlation is present for ratings of Ache females. In spite 
of the anomalous results for ratings of Ache females (discussed below), 
pooling results for different populations of photographic subjects shows 
that men in every population of raters give women higher ratings for 
attractiveness when they have a higher EW/FH. 

There is no strong or consistent effect of EW/FH on female assess- 
ments of male attractiveness. Correlations of EW/FH with PVI and EDMI 
are weak and inconsistent. Correlation of EW/FH with age among the 
Ache is -.16 for females and -.52 for males.


DISCUSSION 


To summarize, we find strong support for the hypothesis that increased 
age is associated with declining facial attractiveness for adults of both 
sexes, at least in samples that span a wide range of ages. We find strong 
support with our Ache photographic samples and weak support with 
other photographic samples for the hypothesis that facial proportions 
close to the population average are associated with increased attractive- 
ness. We find little support for the hypothesis that increased fluctuating 
asymmetry in facial proportions is associated with declining physical 


Table 5. Eye Width to Face Height Ratio and Facial Attractiveness


PHOTOGRAPHS                     PHOTOGRAPHS    PHOTOGRAPHS
  Raters (Male, Female)         OF FEMALES     OF MALES

BRAZILIANS
  n                               49             20
  average EW/FH                  .21            .18
  s.d. EW/FH                     .014           .018
  Brazilians (19,11)             .39**          .05
  U.S. Americans (12,20)         .19            .16
  Russians (11,14)               .31*           .45*
  Ache Indians (11,13)           .43**          .09
  Hiwi Indians (4,4)             .26*           .42*

U.S. AMERICANS
  n                               51             31
  average EW/FH                  .23            .20
  s.d. EW/FH                     .018           .013
  Brazilians (20,23)             .23*           .00
  U.S. Americans (11,18)         .34*          -.03
  Russians (12,14)               .25*           .07
  Ache Indians (20,21)           .37*          -.11
  Hiwi Indians (0,0)             n.a.           n.a.

ACHE INDIANS
  n                               41             36
  average EW/FH                  .19            .18
  s.d. EW/FH                     .013           .013
  Brazilians (17,16)            -.12            .22
  U.S. Americans (12,15)        -.10            .18
  Russians (12,12)              -.21            .00
  Ache Indians (15,15)           .02            .15
  Hiwi Indians (7,4)             .18           -.44**

POOLED SAMPLES
  n                              145             96
  Brazilians (56,50)             .18*           .1
  U.S. Americans (35,53)         .16*           .11
  Russians (35,40)               .13*           .13
  Ache Indians (46,49)           .29**          .05
  Hiwi Indians (7,5)             .22*          -.14


Note: Numbers are Pearson product-moment correlations between average ratings of
attractiveness and the eye width to face height ratio (EW/FH), with age
partialled out. Positive correlations mean that attractiveness increases with
increasing EW/FH.

Significance (one-tailed for females; two-tailed for males):
† p < .10; * p < .05; ** p < .01


attractiveness (although measurement error may have weakened our 
test). We find support in ratings of Brazilian and U.S. females, but not in 
ratings of males or Ache females, for the hypothesis that exaggerated 
feminine/neotenous traits are particularly attractive. 

How strong are our results? Not very, if we measure them as a
percentage of variance explained (r²). This is probably due in part to
error variance in average attractiveness ratings (see above discussion of 
within-sample reliability as measured by Cronbach's alphas) and other 
variables. Our method of ranking photographs, our use of photographic 
rather than direct facial measurements, and attenuation of the range of 
attractiveness owing to the reluctance of unattractive individuals to 
participate are all possible sources of noise in our data. 

But there is another way to gauge the strength of the effects of 
averageness, fluctuating asymmetry, and femininity/neoteny on attractiveness:
we can compare them with the effects of age. For Ache females, one
additional year of age corresponds roughly to a .1 “point” decline in
average attractiveness. (This is an average
figure for all reproductive age females; obviously the slope may vary as a 
function of age. The figure would be a bit lower for Ache males. We cite 
figures for Ache because they have the widest age range.) One standard 
deviation (s.d.) in attractiveness averages around 1.5 points. A correla- 
tion of .34 between the eye width to face height ratio and attractiveness 
(the figure for U.S. males rating U.S. females) means that a 1 s.d. 
decrease in EW/FH corresponds to a .34 s.d. decrease in attractiveness, 
i.e., about .5 points or 5 years’ worth. In other words, for a rough 
estimate of the “years’ worth” of attractiveness corresponding to a 1 s.d.
decrease in the relevant variable, multiply the correlation coefficients in 
Tables 3, 4, and 5 by 15. The exact numbers should not be taken to heart, 
but using attractiveness-year-equivalents as a benchmark may give more 
of an intuitive handle on the approximate magnitudes of the effects 
involved. 
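
The conversion just described is simple arithmetic, and can be written out as a small sketch. The default values are the rough Ache figures quoted in the text (1.5 rating points per standard deviation of attractiveness, .1 point lost per year of age); the function name is ours, and the result is only as good as those approximations.

```python
def years_worth(r, sd_points=1.5, points_per_year=0.1):
    """Years of aging equivalent to a 1 s.d. change in a predictor.

    r               correlation between the predictor and attractiveness
    sd_points       rating points per s.d. of attractiveness (approximate)
    points_per_year rating points lost per year of age (approximate)
    """
    points = r * sd_points           # change in rating points per 1 s.d.
    return points / points_per_year  # expressed in years of aging


# The worked case from the text: r = .34 for U.S. males rating U.S. females.
print(round(years_worth(0.34), 1))
```

With the default values the ratio of the two constants is 15, which is the multiplier suggested for the coefficients in Tables 3, 4, and 5.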


Adaptive Preferences or Adaptive 
By-products? 


The effects of age on facial attractiveness are familiar and expected, 
and they probably have a direct adaptive explanation. The effects (or 
lack thereof) of symmetry, averageness, and neoteny/femininity are less 
commonsensical and may have some bearing on the theoretical dispute 
about the extent to which mating preferences are adaptations or by- 
products of adaptation. We began this paper by comparing two theories 
of the origins of standards of physical attractiveness that seemed espe- 
cially relevant to facial attractiveness. One of these theories, that attrac- 
tive features are markers of developmental homeostasis, was partly 
supported for our face samples—with positive but mostly insignificant 
results for fluctuating asymmetry, and both positive and significant 
results for averageness. However, our analysis also shows that departures
from average proportions increase with increasing age in the one
group, the Ache, that has a large range of ages, so it is possible that 
departures from average proportions seem unattractive as much be- 
cause they are signs of aging as because they are signs of non-age- 
related genetic or epigenetic load. (It would be interesting to know 
whether the composite facial images of Langlois and colleagues and of 
Benson and Perret look younger than the individual faces that went into 
the composite.) 

Our results are also consistent with the arguments of Williams (1992) 
and Enquist and Arak (1993) that selection for choosing a mate of a 
particular sex and age class may lead to a nonadaptive sensory bias for 
supernormal stimuli within sex and age classes. We find no evidence in 
our data set that EW/FH has any relationship to such variables as age at 
menarche, stature, or feminine body proportions (waist to hip ratio), 
which have arguably been relevant to fitness in our evolutionary past 
(see Singh, this issue); thus there may be no adaptive advantage for 
males to prefer feminine/neotenous features over and above the advan- 
tages of choosing mates of a particular sex and age. More research, 
looking at more variables, will be needed to settle whether facial femi- 
ninity, masculinity, and neoteny are associated with components of 
mate value other than age. (See Thornhill and Gangestad in this issue on 
exaggerated facial proportions as possible markers of immunocompe- 
tence.) 


Femininity, Neoteny, Averageness, and Aging 


We noted above that many of the same features (including a high EW/ 
FH) distinguish both juvenile from adult faces and female from male 
faces, which suggests two different explanations of why these features 
seem attractive. The explanation offered by Cunningham (1986),
McArthur and Berry (1983), and Riedl (1990), that it is neoteny or
juvenility that makes facial features attractive, faces the difficulty that
skeletal growth slows down greatly by early adulthood. A high eye
width to face height ratio distinguishes juveniles from adults more 
effectively than it distinguishes young adults from old ones (Behrents 
1985; Enlow 1990). It may be that EW/FH is most important in determin- 
ing physical attractiveness among females in adolescence and early 
adulthood. For younger females, the most important cues may be 
puberty-related changes in body shape, while for older females (and 
older males) changes in soft tissue—wrinkles and sagging skin—may be 
more important age cues than changes in hard tissue. This difference
could explain why we found no effect of EW/FH, but a strong effect of 
nonaverage facial proportions, on the attractiveness of Ache women 
(and men), who were older on average than our non-Ache samples. In
other words, there may be an interaction with age in the effects of 
averageness and femininity/neoteny on facial attractiveness—although 
our current data set is too small to test this possibility properly. 

The other interpretation is that attractive female facial features are 
exaggerated sex-typical features. The two interpretations are not mutu- 
ally exclusive. Simultaneous selection for sex of mate (female) and age of 
mate (young but reproductively mature) could jointly result in an inci- 
dental preference for feminine/neotenous traits within a single sex and 
age class. (Where female standards of male attractiveness are concerned, 
simultaneous selection for sex of mate and age of mate might give more 
complicated and ambiguous results.) Our research suggests the importance
of “supernormal” features in facial attractiveness in real populations,
but further research, perhaps using artificial stimuli, will be
needed to bring out the more subtle features of the “ideal” face in
different populations.